{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Serialization - saving, loading and checkpointing\n",
    "\n",
    "At this point we've already covered quite a lot of ground. \n",
    "We know how to manipulate data and labels.\n",
    "We know how to construct flexible models capable of expressing plausible hypotheses.\n",
    "We know how to fit those models to our dataset.\n",
    "We know of loss functions to use for classification and for regression,\n",
    "and we know how to minimize those losses with respect to our models' parameters. \n",
    "We even know how to write our own neural network layers in ``gluon``.\n",
    "\n",
    "But even with all this knowledge, we're not ready to build a real machine learning system.\n",
    "That's because we haven't yet covered how to save and load models. \n",
    "In reality, we often  train a model on one device\n",
    "and then want to run it to make predictions on many devices simultaneously.\n",
    "In order for our models to persist beyond the execution of a single Python script, \n",
    "we need mechanisms to save and load NDArrays, ``gluon`` Parameters, and models themselves. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from __future__ import print_function\n",
    "import mxnet as mx\n",
    "from mxnet import nd, autograd\n",
    "from mxnet import gluon\n",
    "ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Saving and loading NDArrays\n",
    "\n",
    "To start, let's show how you can save and load a list of NDArrays for future use. Note that while it's possible to use a general Python serialization package like ``Pickle``, it's not optimized for use with NDArrays and will be unnecessarily slow. We prefer to use ``ndarray.save`` and ``ndarray.load``. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = nd.ones((100, 100))\n",
    "Y = nd.zeros((100, 100))\n",
    "import os\n",
    "\n",
    "dir_name = 'checkpoints'\n",
    "if not os.path.exists(dir_name):\n",
    "    os.makedirs(dir_name)\n",
    "\n",
    "filename = os.path.join(dir_name, \"test1.params\")\n",
    "nd.save(filename, [X, Y])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It's just as easy to load a saved NDArray."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "A, B = nd.load(filename)\n",
    "print(A)\n",
    "print(B)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also save a dictionary where the keys are strings and the values are NDArrays."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mydict = {\"X\": X, \"Y\": Y}\n",
    "filename = os.path.join(dir_name, \"test2.params\")\n",
    "nd.save(filename, mydict)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "C = nd.load(filename)\n",
    "print(C)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Saving and loading the parameters of ``gluon`` models\n",
    "\n",
    "Recall from [our first look at the plumbing behind ``gluon`` blocks](P03.5-C01-plumbing.ipynb]) \n",
    "that ``gluon`` wraps the NDArrays corresponding to model parameters in ``Parameter`` objects. \n",
    "We'll often want to store and load an entire model's parameters without \n",
    "having to individually extract or load the NDarrays from the Parameters via ParameterDicts in each block.\n",
    "\n",
    "Fortunately, ``gluon`` blocks make our lives very easy by providing a ``.save_parameters()`` and ``.load_parameters()`` methods. To see them in work, let's just spin up a simple MLP."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_hidden = 256\n",
    "num_outputs = 1\n",
    "net = gluon.nn.Sequential()\n",
    "with net.name_scope():\n",
    "    net.add(gluon.nn.Dense(num_hidden, activation=\"relu\"))\n",
    "    net.add(gluon.nn.Dense(num_hidden, activation=\"relu\"))\n",
    "    net.add(gluon.nn.Dense(num_outputs))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's initialize the parameters by attaching an initializer and actually passing in a datapoint to induce shape inference."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "net.collect_params().initialize(mx.init.Normal(sigma=1.), ctx=ctx)\n",
    "net(nd.ones((1, 100), ctx=ctx))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So this randomly initialized model maps a 100-dimensional vector of all ones to the number 362.53 (that's the number on my machine--your mileage may vary).\n",
    "Let's save the parameters, instantiate a new network, load them in and make sure that we get the same result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "filename = os.path.join(dir_name, \"testnet.params\")\n",
    "net.save_parameters(filename)\n",
    "net2 = gluon.nn.Sequential()\n",
    "with net2.name_scope():\n",
    "    net2.add(gluon.nn.Dense(num_hidden, activation=\"relu\"))\n",
    "    net2.add(gluon.nn.Dense(num_hidden, activation=\"relu\"))\n",
    "    net2.add(gluon.nn.Dense(num_outputs))\n",
    "net2.load_parameters(filename, ctx=ctx)\n",
    "net2(nd.ones((1, 100), ctx=ctx))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Great! Now we're ready to save our work. \n",
    "The practice of saving models is sometimes called *checkpointing*\n",
    "and it's especially important for a number of reasons.\n",
    "1. We can preserve and syndicate models that are trained once.\n",
    "2. Some models perform best (as determined on validation data) at some epoch in the middle of training. If we checkpoint the model after each epoch, we can later select the best epoch.\n",
    "3. We might want to ask questions about our trained model that we didn't think of when we first wrote the scripts for our experiments. Having the parameters lying around allows us to examine our past work without having to train from scratch.\n",
    "4. Sometimes people might want to run our models who don't know how to execute training themselves or can't access a suitable dataset for training. Checkpointing gives us a way to share our work with others."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!-- ## Serializing models themselves\n",
    "\n",
    "[PLACEHOLDER] -->"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Next\n",
    "[Convolutional neural networks from scratch](../chapter04_convolutional-neural-networks/cnn-scratch.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "For whinges or inquiries, [open an issue on  GitHub.](https://github.com/zackchase/mxnet-the-straight-dope)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
